Primitives-based evaluation and estimation of emotions in speech

Authors

Michael Grimm

Kristian Kroschel

Emily Mower Provost

Shrikanth S. Narayanan

Abstract

Emotion primitive descriptions are an important alternative to classical emotion categories for describing a human’s affective expressions. We build a multi-dimensional emotion space composed of the emotion primitives of valence, activation, and dominance. In this study, an image-based, text-free evaluation system is presented that provides intuitive assessment of these emotion primitives and yields high inter-evaluator agreement. An automatic system for estimating the emotion primitives is introduced. We use a fuzzy logic estimator and a rule base derived from acoustic features in speech such as pitch, energy, speaking rate, and spectral characteristics. The approach is tested on two databases. The first database consists of 680 sentences from 3 speakers containing acted emotions in the categories happy, angry, neutral, and sad. The second database contains more than 1000 utterances from 47 speakers with authentic emotion expressions recorded from a television talk show. The estimation results are compared to the human evaluation as a reference, and are moderately to highly correlated (0.42 < r < 0.85). Different scenarios are tested: acted vs. authentic emotions, speaker-dependent vs. speaker-independent emotion estimation, and gender-dependent vs. gender-independent emotion estimation. Finally, continuous-valued estimates of the emotion primitives are mapped into the given emotion categories using a k-nearest neighbor classifier. An overall recognition rate of up to 83.5% is accomplished. The errors of the direct emotion estimation are compared to the confusion matrices of the classification from primitives. As a conclusion to this continuous-valued emotion primitives framework, speaker-dependent modeling of emotion expression is proposed since the emotion primitives are particularly suited for capturing dynamics and intrinsic variations in emotion expression. © 2007 Elsevier B.V. All rights reserved.
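The final step described in the abstract, mapping continuous-valued primitive estimates to emotion categories with a k-nearest neighbor classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the reference points, the value range of [-1, 1], the Euclidean distance, the choice of k = 3, and the names `REFERENCE` and `knn_classify` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical labeled reference points in the 3D emotion space:
# (valence, activation, dominance), each assumed to lie in [-1, 1].
# The numbers are invented for illustration, not taken from the paper.
REFERENCE = [
    ((0.7, 0.6, 0.4), "happy"),
    ((0.6, 0.5, 0.3), "happy"),
    ((-0.6, 0.8, 0.7), "angry"),
    ((-0.5, 0.7, 0.6), "angry"),
    ((0.0, 0.0, 0.0), "neutral"),
    ((0.1, -0.1, 0.0), "neutral"),
    ((-0.6, -0.6, -0.5), "sad"),
    ((-0.5, -0.7, -0.4), "sad"),
]


def knn_classify(point, reference=REFERENCE, k=3):
    """Return the majority label among the k reference points nearest to `point`.

    Distance is Euclidean (an assumption). On a tied vote, the label that
    appears first among the distance-sorted neighbors wins.
    """
    ranked = sorted(reference, key=lambda item: math.dist(point, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # A hypothetical estimator output: positive valence, fairly high activation.
    print(knn_classify((0.65, 0.55, 0.35)))
```

In practice the reference points would be the human-evaluated utterances of a training set, and the query point would be the output of the primitive estimator for a new utterance.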


Similar resources

Acoustic Emotion Recognition in Car Environment Using a 3D Emotion Space Approach

Introduction: The automatic assessment of emotions conveyed in the speech signal has become a rapidly growing research interest in recent years. This paper focuses on a generalized framework to estimate emotions from the speech using an emotion space concept. The performance of such a system is studied in the acoustically demanding environment of vehicular noise while driving. Due to the increas...


Speaker and Listener Variations in Emotion Assessment

Introduction: In this paper we discuss both the speaker dependent and the listener dependent aspects in the assessment of emotions in speech. These dependencies form a basis to improve current emotion recognition systems as they can be applied in man-machine interaction, for instance. Emotion recognition in speech has gained much attention in recent years [1, 2, 3]. However, human evaluation of ...


Modeling Emotion Expression and Perception Behavior in Auditive Emotion Evaluation

In this paper, we consider both speaker dependent and listener dependent aspects in the assessment of emotions in speech. We model the speaker dependencies in emotional speech production by two parameters which describe the individual’s emotional expression behavior. Similarly, we model the listener’s emotion perception behavior by a simple parametric model. These models form a basis for improv...


Multilingual Speech Emotion Recognition System Based on a Three-Layer Model

Speech Emotion Recognition (SER) systems currently are focusing on classifying emotions on each single language. Since optimal acoustic sets are strongly language dependent, to achieve a generalized SER system working for multiple languages, issues of selection of common features and retraining are still challenging. In this paper, we therefore present a SER system in a multilingual scenario fr...


Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...



Journal:

Speech Communication

Volume 49

Publication date: 2007
